Research Question¶
How do funding disparities suggest an underrepresentation of Black and Hispanic youth in high-cost sports at traditional U.S. public schools?
Problem Statement¶
Sports specialization involves year-round training and competition and requires costly investments in participation, travel, and equipment fees, which creates significant financial barriers for youth from lower socioeconomic status (SES) backgrounds. In addition, public school funding disparities can limit access to appropriate facilities, personnel, or physical education, which could further hinder sports participation opportunities for youth in lower-SES communities. These disparities can contribute to the underrepresentation of Black or Hispanic youth in sports with high financial barriers, such as hockey, gymnastics, and tennis, while sports such as track and field are less expensive and therefore more accessible.
Potential Subtopics¶
- Correlation between public school funding and facility quality
- Connection between SES and physical activity/education
Data Definition¶
Public School Characteristics 2022-23
Last Updated: October 21, 2024
https://catalog.data.gov/dataset/public-school-characteristics-2022-23-451db
The National Center for Education Statistics (NCES) gathers demographic and geographic data about U.S. public schools, along with factors such as enrollment and Title I status. The dataset also includes the number of students eligible for free or reduced-price lunch, from which an eligibility percentage can be computed. By analyzing this dataset together with the YRBSS, researchers could examine patterns between lower-SES students or schools and rates of physical activity.
Additional Datasets of Interest¶
Nutrition, Physical Activity, and Obesity - Youth Risk Behavior Surveillance System
Last Updated: February 4, 2025
Conducted by the Centers for Disease Control and Prevention (CDC), the Youth Risk Behavior Surveillance System (YRBSS) monitors health behaviors among middle and high school students nationwide. It collects data on physical activity and nutrition, along with geographic and socioeconomic factors. This data could be used to further research on the impact socioeconomic factors have on health behaviors.
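One way the two sources could be combined is at the state level, since a survey of this kind is unlikely to identify individual schools. The sketch below is hypothetical: the values are made up, and the survey column names (`LocationAbbr`, `PctPhysicallyActive`) are assumptions for illustration, not fields confirmed from the actual YRBSS file.

```python
import pandas as pd

# Toy school-level rows using NCES column names (values are illustrative).
schools = pd.DataFrame({
    "STABR": ["AL", "AL", "VI"],
    "TOTFRL": [697, 1254, 228],   # students eligible for free/reduced-price lunch
    "TOTAL": [890, 1712, 231],    # total enrollment
})

# Hypothetical state-level survey table; column names are assumptions.
yrbss = pd.DataFrame({
    "LocationAbbr": ["AL", "VI"],
    "PctPhysicallyActive": [40.0, 35.0],  # made-up percentages
})

# Aggregate schools to one row per state, then compute the FRL share.
by_state = schools.groupby("STABR", as_index=False)[["TOTFRL", "TOTAL"]].sum()
by_state["PctFRL"] = 100 * by_state["TOTFRL"] / by_state["TOTAL"]

# A left join keeps every state present in the school data.
merged = by_state.merge(yrbss, left_on="STABR", right_on="LocationAbbr", how="left")
print(merged[["STABR", "PctFRL", "PctPhysicallyActive"]])
```

Summing the counts before dividing weights each school by its enrollment, which is usually preferable to averaging per-school percentages.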
Data Collection¶
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
import warnings
warnings.filterwarnings('ignore')
Read the Data¶
# read_csv already returns a DataFrame, so no extra conversion is needed
psChar_23 = pd.read_csv('Public_School_Characteristics_2022-23.csv')
psChar_23.head(7)
| | X | Y | OBJECTID | NCESSCH | SURVYEAR | STABR | LEAID | ST_LEAID | LEA_NAME | SCH_NAME | LSTREET1 | LSTREET2 | LCITY | LSTATE | LZIP | LZIP4 | PHONE | CHARTER_TEXT | VIRTUAL | GSLO | GSHI | SCHOOL_LEVEL | STATUS | SCHOOL_TYPE_TEXT | SY_STATUS_TEXT | ULOCALE | NMCNTY | TOTFRL | FRELCH | REDLCH | DIRECTCERT | PK | KG | G01 | G02 | G03 | G04 | G05 | G06 | G07 | G08 | G09 | G10 | G11 | G12 | G13 | UG | AE | TOTMENROL | TOTFENROL | TOTAL | MEMBER | FTE | STUTERATIO | AMALM | AMALF | AM | ASALM | ASALF | AS | BLALM | BLALF | BL | HPALM | HPALF | HP | HIALM | HIALF | HI | TRALM | TRALF | TR | WHALM | WHALF | WH | LATCOD | LONCOD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -86.206200 | 34.26020 | 1 | 10000500870 | 2022-2023 | AL | 100005 | AL-101 | Albertville City | Albertville Middle School | 600 E Alabama Ave | NaN | Albertville | AL | 35950 | (256)878-2341 | No | Not Virtual | 07 | 08 | Middle | 1 | Regular School | Currently operational | 32-Town: Distant | Marshall County | 697 | 654 | 43 | 587 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 440.0 | 450.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 459.0 | 431.0 | 890.0 | 890.0 | 45.000000 | 19.78 | 4.0 | 1.0 | 5.0 | 4.0 | 2.0 | 6.0 | 15.0 | 14.0 | 29.0 | 0.0 | 1.0 | 1.0 | 251.0 | 251.0 | 502.0 | 17.0 | 15.0 | 32.0 | 168.0 | 147.0 | 315.0 | 34.26020 | -86.206200 | |
| 1 | -86.204900 | 34.26220 | 2 | 10000500871 | 2022-2023 | AL | 100005 | AL-101 | Albertville City | Albertville High School | 402 E McCord Ave | NaN | Albertville | AL | 35950 | 2322 | (256)894-5000 | No | Not Virtual | 09 | 12 | High | 1 | Regular School | Currently operational | 32-Town: Distant | Marshall County | 1254 | 1178 | 76 | 1059 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 493.0 | 442.0 | 390.0 | 387.0 | NaN | NaN | NaN | 868.0 | 844.0 | 1712.0 | 1712.0 | 85.199997 | 20.09 | 0.0 | 2.0 | 2.0 | 4.0 | 5.0 | 9.0 | 23.0 | 34.0 | 57.0 | 0.0 | 0.0 | 0.0 | 490.0 | 468.0 | 958.0 | 26.0 | 19.0 | 45.0 | 325.0 | 316.0 | 641.0 | 34.26220 | -86.204900 |
| 2 | -86.220100 | 34.27330 | 3 | 10000500879 | 2022-2023 | AL | 100005 | AL-101 | Albertville City | Albertville Intermediate School | 901 W McKinney Ave | NaN | Albertville | AL | 35950 | 1300 | (256)878-7698 | No | Not Virtual | 05 | 06 | Middle | 1 | Regular School | Currently operational | 32-Town: Distant | Marshall County | 718 | 665 | 53 | 570 | NaN | NaN | NaN | NaN | NaN | NaN | 412.0 | 462.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 451.0 | 423.0 | 874.0 | 874.0 | 43.000000 | 20.33 | 1.0 | 4.0 | 5.0 | 4.0 | 0.0 | 4.0 | 22.0 | 28.0 | 50.0 | 0.0 | 0.0 | 0.0 | 263.0 | 241.0 | 504.0 | 7.0 | 6.0 | 13.0 | 154.0 | 144.0 | 298.0 | 34.27330 | -86.220100 |
| 3 | -86.221806 | 34.25270 | 4 | 10000500889 | 2022-2023 | AL | 100005 | AL-101 | Albertville City | Albertville Elementary School | 145 West End Drive | NaN | Albertville | AL | 35950 | (256)894-4822 | No | Not Virtual | 03 | 04 | Elementary | 1 | Regular School | Currently operational | 32-Town: Distant | Marshall County | 723 | 680 | 43 | 583 | NaN | NaN | NaN | NaN | 430.0 | 444.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 463.0 | 411.0 | 874.0 | 874.0 | 43.000000 | 20.33 | 0.0 | 4.0 | 4.0 | 1.0 | 3.0 | 4.0 | 22.0 | 16.0 | 38.0 | 0.0 | 0.0 | 0.0 | 261.0 | 236.0 | 497.0 | 11.0 | 16.0 | 27.0 | 168.0 | 136.0 | 304.0 | 34.25270 | -86.221806 | |
| 4 | -86.193300 | 34.28980 | 5 | 10000501616 | 2022-2023 | AL | 100005 | AL-101 | Albertville City | Albertville Kindergarten and PreK | 257 Country Club Rd | NaN | Albertville | AL | 35951 | 3927 | (256)878-7922 | No | Not Virtual | PK | KG | Elementary | 1 | Regular School | Currently operational | 32-Town: Distant | Marshall County | 392 | 367 | 25 | 240 | 133.0 | 473.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 304.0 | 302.0 | 606.0 | 606.0 | 26.000000 | 23.31 | 1.0 | 3.0 | 4.0 | 2.0 | 0.0 | 2.0 | 26.0 | 23.0 | 49.0 | 0.0 | 0.0 | 0.0 | 167.0 | 152.0 | 319.0 | 4.0 | 4.0 | 8.0 | 104.0 | 120.0 | 224.0 | 34.28980 | -86.193300 |
| 5 | -86.221800 | 34.25330 | 6 | 10000502150 | 2022-2023 | AL | 100005 | AL-101 | Albertville City | Albertville Primary School | 1100 Horton Rd | NaN | Albertville | AL | 35950 | 2532 | (256)878-6611 | No | Not Virtual | 01 | 02 | Elementary | 1 | Regular School | Currently operational | 32-Town: Distant | Marshall County | 779 | 726 | 53 | 617 | 0.0 | NaN | 427.0 | 517.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 498.0 | 446.0 | 944.0 | 944.0 | 61.000000 | 15.48 | 9.0 | 1.0 | 10.0 | 3.0 | 0.0 | 3.0 | 24.0 | 21.0 | 45.0 | 0.0 | 1.0 | 1.0 | 290.0 | 256.0 | 546.0 | 9.0 | 10.0 | 19.0 | 163.0 | 157.0 | 320.0 | 34.25330 | -86.221800 |
| 6 | -86.254153 | 34.53375 | 7 | 10000600193 | 2022-2023 | AL | 100006 | AL-048 | Marshall County | Kate Duncan Smith DAR Middle | 6077 Main St | NaN | Grant | AL | 35747 | (256)728-5950 | No | Not Virtual | 05 | 08 | Middle | 1 | Regular School | Currently operational | 42-Rural: Distant | Marshall County | 151 | 123 | 28 | 194 | NaN | NaN | NaN | NaN | NaN | NaN | 95.0 | 97.0 | 86.0 | 86.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 192.0 | 172.0 | 364.0 | 364.0 | 22.030001 | 16.52 | 1.0 | 3.0 | 4.0 | 0.0 | 0.0 | 0.0 | 2.0 | 0.0 | 2.0 | 0.0 | 0.0 | 0.0 | 6.0 | 8.0 | 14.0 | 5.0 | 9.0 | 14.0 | 178.0 | 152.0 | 330.0 | 34.53375 | -86.254153 |
psChar_23.tail(7)
| | X | Y | OBJECTID | NCESSCH | SURVYEAR | STABR | LEAID | ST_LEAID | LEA_NAME | SCH_NAME | LSTREET1 | LSTREET2 | LCITY | LSTATE | LZIP | LZIP4 | PHONE | CHARTER_TEXT | VIRTUAL | GSLO | GSHI | SCHOOL_LEVEL | STATUS | SCHOOL_TYPE_TEXT | SY_STATUS_TEXT | ULOCALE | NMCNTY | TOTFRL | FRELCH | REDLCH | DIRECTCERT | PK | KG | G01 | G02 | G03 | G04 | G05 | G06 | G07 | G08 | G09 | G10 | G11 | G12 | G13 | UG | AE | TOTMENROL | TOTFENROL | TOTAL | MEMBER | FTE | STUTERATIO | AMALM | AMALF | AM | ASALM | ASALF | AS | BLALM | BLALF | BL | HPALM | HPALF | HP | HIALM | HIALF | HI | TRALM | TRALF | TR | WHALM | WHALF | WH | LATCOD | LONCOD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 101383 | -64.932456 | 18.352146 | 101384 | 780003000020 | 2022-2023 | VI | 7800030 | VI-001 | Saint Thomas - Saint John School District | JOSEPH SIBILLY ELEMENTARY SCHOOL | 14 15 16 ESTATE ELIZABETH | NaN | Saint Thomas | VI | 802 | (340)774-7001 | N | Not Virtual | PK | 06 | Elementary | 1 | Regular School | Currently operational | 33-Town: Remote | St. Thomas Island | 228 | 228 | 0 | -1 | 19.0 | 25.0 | 25.0 | 25.0 | 31.0 | 34.0 | 34.0 | 38.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 121.0 | 110.0 | 231.0 | 231.0 | 16.0 | 14.44 | 0.0 | 0.0 | 0.0 | 2.0 | 2.0 | 4.0 | 99.0 | 93.0 | 192.0 | 0.0 | 0.0 | 0.0 | 8.0 | 5.0 | 13.0 | 2.0 | 1.0 | 3.0 | 10.0 | 9.0 | 19.0 | 18.352146 | -64.932456 | |
| 101384 | -64.793916 | 18.330464 | 101385 | 780003000022 | 2022-2023 | VI | 7800030 | VI-001 | Saint Thomas - Saint John School District | JULIUS E SPRAUVE | 14 18 ESTATE ENIGHED | NaN | Saint John | VI | 831 | (340)776-6336 | N | Not Virtual | PK | 08 | Elementary | 1 | Regular School | Currently operational | 33-Town: Remote | St. John Island | 199 | 199 | 0 | -1 | 8.0 | 21.0 | 16.0 | 21.0 | 14.0 | 24.0 | 20.0 | 26.0 | 27.0 | 25.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 103.0 | 99.0 | 202.0 | 202.0 | 20.0 | 10.10 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 79.0 | 68.0 | 147.0 | 0.0 | 0.0 | 0.0 | 22.0 | 29.0 | 51.0 | 0.0 | 0.0 | 0.0 | 1.0 | 2.0 | 3.0 | 18.330464 | -64.793916 | |
| 101385 | -64.917602 | 18.341950 | 101386 | 780003000024 | 2022-2023 | VI | 7800030 | VI-001 | Saint Thomas - Saint John School District | LOCKHART ELEMENTARY SCHOOL | 41 ESTATE THOMAS | NaN | Saint Thomas | VI | 802 | (340)775-0820 | N | Not Virtual | KG | 03 | Elementary | 1 | Regular School | Currently operational | 33-Town: Remote | St. Thomas Island | 295 | 295 | 0 | -1 | NaN | 77.0 | 75.0 | 69.0 | 77.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 171.0 | 127.0 | 298.0 | 298.0 | 18.0 | 16.56 | 0.0 | 0.0 | 0.0 | 4.0 | 3.0 | 7.0 | 132.0 | 92.0 | 224.0 | 0.0 | 0.0 | 0.0 | 33.0 | 30.0 | 63.0 | 1.0 | 2.0 | 3.0 | 1.0 | 0.0 | 1.0 | 18.341950 | -64.917602 | |
| 101386 | -64.952483 | 18.338742 | 101387 | 780003000026 | 2022-2023 | VI | 7800030 | VI-001 | Saint Thomas - Saint John School District | ULLA F MULLER ELEMENTARY SCHOOL | 7B ESTATE CONTANT | NaN | Saint Thomas | VI | 802 | (340)774-0059 | N | Not Virtual | KG | 06 | Elementary | 1 | Regular School | Currently operational | 33-Town: Remote | St. Thomas Island | 417 | 417 | 0 | -1 | NaN | 52.0 | 53.0 | 51.0 | 47.0 | 70.0 | 79.0 | 68.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 200.0 | 220.0 | 420.0 | 420.0 | 28.0 | 15.00 | 0.0 | 2.0 | 2.0 | 2.0 | 4.0 | 6.0 | 167.0 | 182.0 | 349.0 | 0.0 | 0.0 | 0.0 | 27.0 | 27.0 | 54.0 | 2.0 | 0.0 | 2.0 | 2.0 | 5.0 | 7.0 | 18.338742 | -64.952483 | |
| 101387 | -64.899024 | 18.354782 | 101388 | 780003000027 | 2022-2023 | VI | 7800030 | VI-001 | Saint Thomas - Saint John School District | YVONNE BOWSKY ELEMENTARY SCHOOL | 15B and 16 ESTATE MANDAHL | NaN | Saint Thomas | VI | 802 | (340)775-3220 | N | Not Virtual | PK | 05 | Elementary | 1 | Regular School | Currently operational | 33-Town: Remote | St. Thomas Island | 425 | 425 | 0 | -1 | 22.0 | 62.0 | 67.0 | 66.0 | 75.0 | 68.0 | 68.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 252.0 | 176.0 | 428.0 | 428.0 | 34.0 | 12.59 | 1.0 | 1.0 | 2.0 | 5.0 | 4.0 | 9.0 | 201.0 | 144.0 | 345.0 | 0.0 | 0.0 | 0.0 | 37.0 | 22.0 | 59.0 | 0.0 | 1.0 | 1.0 | 8.0 | 4.0 | 12.0 | 18.354782 | -64.899024 | |
| 101388 | -64.945940 | 18.336658 | 101389 | 780003000033 | 2022-2023 | VI | 7800030 | VI-001 | Saint Thomas - Saint John School District | CANCRYN JUNIOR HIGH SCHOOL | 1 CROWN BAY | NaN | Saint Thomas | VI | 804 | (340)774-4540 | N | Not Virtual | 04 | 08 | Middle | 1 | Regular School | Currently operational | 33-Town: Remote | St. Thomas Island | 683 | 683 | 0 | -1 | NaN | NaN | NaN | NaN | NaN | 77.0 | 119.0 | 96.0 | 189.0 | 205.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 361.0 | 325.0 | 686.0 | 686.0 | 62.0 | 11.06 | 0.0 | 0.0 | 0.0 | 2.0 | 2.0 | 4.0 | 279.0 | 250.0 | 529.0 | 0.0 | 0.0 | 0.0 | 74.0 | 62.0 | 136.0 | 0.0 | 1.0 | 1.0 | 6.0 | 10.0 | 16.0 | 18.336658 | -64.945940 | |
| 101389 | -64.890311 | 18.318230 | 101390 | 780003000034 | 2022-2023 | VI | 7800030 | VI-001 | Saint Thomas - Saint John School District | BERTHA BOSCHULTE JUNIOR HIGH | 9 1 and 12A BOVONI | NaN | Saint Thomas | VI | 802 | (340)775-4222 | N | Not Virtual | 06 | 08 | Middle | 1 | Regular School | Currently operational | 33-Town: Remote | St. Thomas Island | 504 | 504 | 0 | -1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 145.0 | 169.0 | 193.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 279.0 | 228.0 | 507.0 | 507.0 | 49.0 | 10.35 | 0.0 | 0.0 | 0.0 | 2.0 | 1.0 | 3.0 | 250.0 | 204.0 | 454.0 | 0.0 | 0.0 | 0.0 | 27.0 | 21.0 | 48.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 2.0 | 18.318230 | -64.890311 |
psChar_23.shape
(101390, 77)
- The dataframe has 101,390 rows of data.
- The dataframe has 77 columns or features.
- There are 7,807,030 total data points (101,390 × 77) in the dataset.
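The total number of data points is the number of rows multiplied by the number of columns, which pandas also exposes as `DataFrame.size`. A minimal sketch on a toy frame (the rows below are illustrative, not the full dataset):

```python
import pandas as pd

# Toy frame standing in for psChar_23: 3 rows, 2 columns.
toy = pd.DataFrame({"SCH_NAME": ["A", "B", "C"], "TOTAL": [890, 1712, 874]})

n_rows, n_cols = toy.shape
# size counts every cell, including cells that hold NaN
print(n_rows, n_cols, toy.size)
```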
psChar_23.info(show_counts=True, verbose=True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 101390 entries, 0 to 101389
Data columns (total 77 columns):
 #   Column            Non-Null Count   Dtype
---  ------            --------------   -----
 0   X                 101390 non-null  float64
 1   Y                 101390 non-null  float64
 2   OBJECTID          101390 non-null  int64
 3   NCESSCH           101390 non-null  int64
 4   SURVYEAR          101390 non-null  object
 5   STABR             101390 non-null  object
 6   LEAID             101390 non-null  int64
 7   ST_LEAID          101390 non-null  object
 8   LEA_NAME          101390 non-null  object
 9   SCH_NAME          101390 non-null  object
 10  LSTREET1          101389 non-null  object
 11  LSTREET2          572 non-null     object
 12  LCITY             101390 non-null  object
 13  LSTATE            101390 non-null  object
 14  LZIP              101390 non-null  int64
 15  LZIP4             101390 non-null  object
 16  PHONE             101390 non-null  object
 17  CHARTER_TEXT      101390 non-null  object
 18  VIRTUAL           101390 non-null  object
 19  GSLO              101390 non-null  object
 20  GSHI              101390 non-null  object
 21  SCHOOL_LEVEL      101390 non-null  object
 22  STATUS            101390 non-null  int64
 23  SCHOOL_TYPE_TEXT  101390 non-null  object
 24  SY_STATUS_TEXT    101390 non-null  object
 25  ULOCALE           101390 non-null  object
 26  NMCNTY            101390 non-null  object
 27  TOTFRL            101390 non-null  int64
 28  FRELCH            101390 non-null  int64
 29  REDLCH            101390 non-null  int64
 30  DIRECTCERT        101390 non-null  int64
 31  PK                32392 non-null   float64
 32  KG                54061 non-null   float64
 33  G01               54412 non-null   float64
 34  G02               54469 non-null   float64
 35  G03               54459 non-null   float64
 36  G04               54258 non-null   float64
 37  G05               53014 non-null   float64
 38  G06               38023 non-null   float64
 39  G07               33224 non-null   float64
 40  G08               33492 non-null   float64
 41  G09               28101 non-null   float64
 42  G10               27889 non-null   float64
 43  G11               27888 non-null   float64
 44  G12               27816 non-null   float64
 45  G13               133 non-null     float64
 46  UG                7889 non-null    float64
 47  AE                183 non-null     float64
 48  TOTMENROL         98910 non-null   float64
 49  TOTFENROL         98910 non-null   float64
 50  TOTAL             99719 non-null   float64
 51  MEMBER            99719 non-null   float64
 52  FTE               97537 non-null   float64
 53  STUTERATIO        99576 non-null   float64
 54  AMALM             98809 non-null   float64
 55  AMALF             98811 non-null   float64
 56  AM                98857 non-null   float64
 57  ASALM             98898 non-null   float64
 58  ASALF             98900 non-null   float64
 59  AS                98906 non-null   float64
 60  BLALM             98896 non-null   float64
 61  BLALF             98893 non-null   float64
 62  BL                98903 non-null   float64
 63  HPALM             98782 non-null   float64
 64  HPALF             98783 non-null   float64
 65  HP                98829 non-null   float64
 66  HIALM             98909 non-null   float64
 67  HIALF             98910 non-null   float64
 68  HI                98910 non-null   float64
 69  TRALM             98903 non-null   float64
 70  TRALF             98905 non-null   float64
 71  TR                98906 non-null   float64
 72  WHALM             98909 non-null   float64
 73  WHALF             98909 non-null   float64
 74  WH                98910 non-null   float64
 75  LATCOD            101390 non-null  float64
 76  LONCOD            101390 non-null  float64
dtypes: float64(48), int64(9), object(20)
memory usage: 59.6+ MB
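The `info()` output lists identifier-like columns such as `LZIP` and `NCESSCH` as int64, and the tail of the frame shows U.S. Virgin Islands ZIP codes as 802 rather than 00802. The sketch below shows how passing `dtype` to `read_csv` would keep such fields as text. It assumes the raw CSV stores these identifiers zero-padded, which this notebook does not confirm, and it uses a small in-memory CSV rather than the real file.

```python
import io
import pandas as pd

# Toy CSV with two identifier columns; the zero-padded values are an assumption.
csv_text = (
    "NCESSCH,LZIP,SCH_NAME\n"
    "780003000020,00802,JOSEPH SIBILLY ELEMENTARY SCHOOL\n"
    "010000500870,35950,Albertville Middle School\n"
)

# Default parsing infers integers and drops the leading zeros.
default = pd.read_csv(io.StringIO(csv_text))

# Declaring the columns as strings preserves them exactly as written.
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"NCESSCH": str, "LZIP": str})

print(default["LZIP"].tolist())   # [802, 35950]
print(as_text["LZIP"].tolist())   # ['00802', '35950']
```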
ps23Cols = psChar_23.columns
ps23Cols
Index(['X', 'Y', 'OBJECTID', 'NCESSCH', 'SURVYEAR', 'STABR', 'LEAID',
'ST_LEAID', 'LEA_NAME', 'SCH_NAME', 'LSTREET1', 'LSTREET2', 'LCITY',
'LSTATE', 'LZIP', 'LZIP4', 'PHONE', 'CHARTER_TEXT', 'VIRTUAL', 'GSLO',
'GSHI', 'SCHOOL_LEVEL', 'STATUS', 'SCHOOL_TYPE_TEXT', 'SY_STATUS_TEXT',
'ULOCALE', 'NMCNTY', 'TOTFRL', 'FRELCH', 'REDLCH', 'DIRECTCERT', 'PK',
'KG', 'G01', 'G02', 'G03', 'G04', 'G05', 'G06', 'G07', 'G08', 'G09',
'G10', 'G11', 'G12', 'G13', 'UG', 'AE', 'TOTMENROL', 'TOTFENROL',
'TOTAL', 'MEMBER', 'FTE', 'STUTERATIO', 'AMALM', 'AMALF', 'AM', 'ASALM',
'ASALF', 'AS', 'BLALM', 'BLALF', 'BL', 'HPALM', 'HPALF', 'HP', 'HIALM',
'HIALF', 'HI', 'TRALM', 'TRALF', 'TR', 'WHALM', 'WHALF', 'WH', 'LATCOD',
'LONCOD'],
dtype='object')
psChar_23 = psChar_23.rename(columns = {'OBJECTID':'ObjectID','NCESSCH':'NCESID','SURVYEAR':'SurveyYear',
'STABR':'StateABR','LEA_NAME':'LEAname','SCH_NAME':'SchoolName',
'LSTREET1':'Street1','LSTREET2':'Street2','LCITY':'City',
'LSTATE':'State','LZIP':'Zip','LZIP4':'Zip4',
'PHONE':'Phone', 'CHARTER_TEXT':'Charter', 'VIRTUAL':'Virtual',
'GSLO':'LowestGrade','GSHI':'HighestGrade',
'SCHOOL_LEVEL':'SchoolLevel',
'STATUS':'Status', 'SCHOOL_TYPE_TEXT':'SchoolType',
'SY_STATUS_TEXT':'Status_Text',
'ULOCALE':'Locale', 'NMCNTY':'County',
'TOTFRL':'TotalFreeLunch',
'FRELCH':'FreeLunch', 'REDLCH':'ReducedLunch',
'DIRECTCERT':'MealProgramCertified', 'PK':'PreK',
'KG':'Kindergarten', 'G01':'Grade1', 'G02':'Grade2',
'G03':'Grade3', 'G04':'Grade4', 'G05':'Grade5',
'G06':'Grade6', 'G07':'Grade7', 'G08':'Grade8',
'G09':'Grade9','G10':'Grade10', 'G11':'Grade11',
'G12':'Grade12','G13':'Grade13', 'UG':'Ungraded',
'AE':'AdultEd', 'TOTMENROL':'TotMaleEnrollment',
'TOTFENROL':'TotFemaleEnrollment','TOTAL':'TotalEnrollment',
'MEMBER':'Member', 'FTE':'StaffFTE', 'STUTERATIO':'StudentTeacherRatio',
'AMALM':'AIANMale','AMALF':'AIANFem', 'AM':'AIANTotal',
'ASALM':'AsianMale', 'ASALF':'AsianFemale', 'AS':'AsianTotal',
'BLALM':'BlackMale','BLALF':'BlackFemale', 'BL':'BlackTotal',
'HPALM':'HPIMale', 'HPALF':'HPIFemale', 'HP':'HPITotal',
'HIALM':'HispanicMale','HIALF':'HispanicFemale', 'HI':'HispanicTotal',
'TRALM':'TRMale', 'TRALF':'TRFemale', 'TR':'TRTotal',
'WHALM':'WhiteMale','WHALF':'WhiteFemale', 'WH':'WhiteTotal',
'LATCOD':'Latitude','LONCOD':'Longitude'})
ps23Cols = psChar_23.columns
psChar_23.head()
| | X | Y | ObjectID | NCESID | SurveyYear | StateABR | LEAID | ST_LEAID | LEAname | SchoolName | Street1 | Street2 | City | State | Zip | Zip4 | Phone | Charter | Virtual | LowestGrade | HighestGrade | SchoolLevel | Status | SchoolType | Status_Text | Locale | County | TotalFreeLunch | FreeLunch | ReducedLunch | MealProgramCertified | PreK | Kindergarten | Grade1 | Grade2 | Grade3 | Grade4 | Grade5 | Grade6 | Grade7 | Grade8 | Grade9 | Grade10 | Grade11 | Grade12 | Grade13 | Ungraded | AdultEd | TotMaleEnrollment | TotFemaleEnrollment | TotalEnrollment | Member | StaffFTE | StudentTeacherRatio | AIANMale | AIANFem | AIANTotal | AsianMale | AsianFemale | AsianTotal | BlackMale | BlackFemale | BlackTotal | HPIMale | HPIFemale | HPITotal | HispanicMale | HispanicFemale | HispanicTotal | TRMale | TRFemale | TRTotal | WhiteMale | WhiteFemale | WhiteTotal | Latitude | Longitude |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -86.206200 | 34.2602 | 1 | 10000500870 | 2022-2023 | AL | 100005 | AL-101 | Albertville City | Albertville Middle School | 600 E Alabama Ave | NaN | Albertville | AL | 35950 | (256)878-2341 | No | Not Virtual | 07 | 08 | Middle | 1 | Regular School | Currently operational | 32-Town: Distant | Marshall County | 697 | 654 | 43 | 587 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 440.0 | 450.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 459.0 | 431.0 | 890.0 | 890.0 | 45.000000 | 19.78 | 4.0 | 1.0 | 5.0 | 4.0 | 2.0 | 6.0 | 15.0 | 14.0 | 29.0 | 0.0 | 1.0 | 1.0 | 251.0 | 251.0 | 502.0 | 17.0 | 15.0 | 32.0 | 168.0 | 147.0 | 315.0 | 34.2602 | -86.206200 | |
| 1 | -86.204900 | 34.2622 | 2 | 10000500871 | 2022-2023 | AL | 100005 | AL-101 | Albertville City | Albertville High School | 402 E McCord Ave | NaN | Albertville | AL | 35950 | 2322 | (256)894-5000 | No | Not Virtual | 09 | 12 | High | 1 | Regular School | Currently operational | 32-Town: Distant | Marshall County | 1254 | 1178 | 76 | 1059 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 493.0 | 442.0 | 390.0 | 387.0 | NaN | NaN | NaN | 868.0 | 844.0 | 1712.0 | 1712.0 | 85.199997 | 20.09 | 0.0 | 2.0 | 2.0 | 4.0 | 5.0 | 9.0 | 23.0 | 34.0 | 57.0 | 0.0 | 0.0 | 0.0 | 490.0 | 468.0 | 958.0 | 26.0 | 19.0 | 45.0 | 325.0 | 316.0 | 641.0 | 34.2622 | -86.204900 |
| 2 | -86.220100 | 34.2733 | 3 | 10000500879 | 2022-2023 | AL | 100005 | AL-101 | Albertville City | Albertville Intermediate School | 901 W McKinney Ave | NaN | Albertville | AL | 35950 | 1300 | (256)878-7698 | No | Not Virtual | 05 | 06 | Middle | 1 | Regular School | Currently operational | 32-Town: Distant | Marshall County | 718 | 665 | 53 | 570 | NaN | NaN | NaN | NaN | NaN | NaN | 412.0 | 462.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 451.0 | 423.0 | 874.0 | 874.0 | 43.000000 | 20.33 | 1.0 | 4.0 | 5.0 | 4.0 | 0.0 | 4.0 | 22.0 | 28.0 | 50.0 | 0.0 | 0.0 | 0.0 | 263.0 | 241.0 | 504.0 | 7.0 | 6.0 | 13.0 | 154.0 | 144.0 | 298.0 | 34.2733 | -86.220100 |
| 3 | -86.221806 | 34.2527 | 4 | 10000500889 | 2022-2023 | AL | 100005 | AL-101 | Albertville City | Albertville Elementary School | 145 West End Drive | NaN | Albertville | AL | 35950 | (256)894-4822 | No | Not Virtual | 03 | 04 | Elementary | 1 | Regular School | Currently operational | 32-Town: Distant | Marshall County | 723 | 680 | 43 | 583 | NaN | NaN | NaN | NaN | 430.0 | 444.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 463.0 | 411.0 | 874.0 | 874.0 | 43.000000 | 20.33 | 0.0 | 4.0 | 4.0 | 1.0 | 3.0 | 4.0 | 22.0 | 16.0 | 38.0 | 0.0 | 0.0 | 0.0 | 261.0 | 236.0 | 497.0 | 11.0 | 16.0 | 27.0 | 168.0 | 136.0 | 304.0 | 34.2527 | -86.221806 | |
| 4 | -86.193300 | 34.2898 | 5 | 10000501616 | 2022-2023 | AL | 100005 | AL-101 | Albertville City | Albertville Kindergarten and PreK | 257 Country Club Rd | NaN | Albertville | AL | 35951 | 3927 | (256)878-7922 | No | Not Virtual | PK | KG | Elementary | 1 | Regular School | Currently operational | 32-Town: Distant | Marshall County | 392 | 367 | 25 | 240 | 133.0 | 473.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 304.0 | 302.0 | 606.0 | 606.0 | 26.000000 | 23.31 | 1.0 | 3.0 | 4.0 | 2.0 | 0.0 | 2.0 | 26.0 | 23.0 | 49.0 | 0.0 | 0.0 | 0.0 | 167.0 | 152.0 | 319.0 | 4.0 | 4.0 | 8.0 | 104.0 | 120.0 | 224.0 | 34.2898 | -86.193300 |
psChar_23.isnull().sum()
X                            0
Y                            0
ObjectID                     0
NCESID                       0
SurveyYear                   0
StateABR                     0
LEAID                        0
ST_LEAID                     0
LEAname                      0
SchoolName                   0
Street1                      1
Street2                 100818
City                         0
State                        0
Zip                          0
Zip4                         0
Phone                        0
Charter                      0
Virtual                      0
LowestGrade                  0
HighestGrade                 0
SchoolLevel                  0
Status                       0
SchoolType                   0
Status_Text                  0
Locale                       0
County                       0
TotalFreeLunch               0
FreeLunch                    0
ReducedLunch                 0
MealProgramCertified         0
PreK                     68998
Kindergarten             47329
Grade1                   46978
Grade2                   46921
Grade3                   46931
Grade4                   47132
Grade5                   48376
Grade6                   63367
Grade7                   68166
Grade8                   67898
Grade9                   73289
Grade10                  73501
Grade11                  73502
Grade12                  73574
Grade13                 101257
Ungraded                 93501
AdultEd                 101207
TotMaleEnrollment         2480
TotFemaleEnrollment       2480
TotalEnrollment           1671
Member                    1671
StaffFTE                  3853
StudentTeacherRatio       1814
AIANMale                  2581
AIANFem                   2579
AIANTotal                 2533
AsianMale                 2492
AsianFemale               2490
AsianTotal                2484
BlackMale                 2494
BlackFemale               2497
BlackTotal                2487
HPIMale                   2608
HPIFemale                 2607
HPITotal                  2561
HispanicMale              2481
HispanicFemale            2480
HispanicTotal             2480
TRMale                    2487
TRFemale                  2485
TRTotal                   2484
WhiteMale                 2481
WhiteFemale               2481
WhiteTotal                2480
Latitude                     0
Longitude                    0
dtype: int64
def missing(df):
    """Print the percentage of missing values per column, highest first."""
    pct_missing = round(df.isnull().sum() * 100 / len(df), 2)
    print('Percentage of missing values in the dataset:\n',
          pct_missing.sort_values(ascending=False))
missing(psChar_23)
Percentage of missing values in the dataset:
Grade13                 99.87
AdultEd                 99.82
Street2                 99.44
Ungraded                92.22
Grade12                 72.57
Grade10                 72.49
Grade11                 72.49
Grade9                  72.28
PreK                    68.05
Grade7                  67.23
Grade8                  66.97
Grade6                  62.50
Grade5                  47.71
Kindergarten            46.68
Grade4                  46.49
Grade1                  46.33
Grade3                  46.29
Grade2                  46.28
StaffFTE                 3.80
HPIFemale                2.57
HPIMale                  2.57
AIANMale                 2.55
AIANFem                  2.54
HPITotal                 2.53
AIANTotal                2.50
AsianMale                2.46
BlackMale                2.46
AsianFemale              2.46
BlackFemale              2.46
WhiteFemale              2.45
WhiteTotal               2.45
TRTotal                  2.45
AsianTotal               2.45
BlackTotal               2.45
HispanicMale             2.45
HispanicFemale           2.45
HispanicTotal            2.45
WhiteMale                2.45
TotFemaleEnrollment      2.45
TRFemale                 2.45
TRMale                   2.45
TotMaleEnrollment        2.45
StudentTeacherRatio      1.79
TotalEnrollment          1.65
Member                   1.65
City                     0.00
Street1                  0.00
SchoolName               0.00
LEAname                  0.00
LEAID                    0.00
ST_LEAID                 0.00
StateABR                 0.00
SurveyYear               0.00
X                        0.00
NCESID                   0.00
ObjectID                 0.00
Y                        0.00
ReducedLunch             0.00
MealProgramCertified     0.00
TotalFreeLunch           0.00
FreeLunch                0.00
Zip                      0.00
State                    0.00
Zip4                     0.00
Phone                    0.00
Charter                  0.00
Virtual                  0.00
LowestGrade              0.00
HighestGrade             0.00
SchoolLevel              0.00
Status                   0.00
SchoolType               0.00
Status_Text              0.00
Locale                   0.00
County                   0.00
Latitude                 0.00
Longitude                0.00
dtype: float64
Observations¶
A total of eighteen columns have missing-value percentages above forty-five percent. For the 'Grade' columns, this is likely because the dataset includes schools at various education levels, so a given school does not offer every grade. Furthermore, the columns for free/reduced-price lunch and the student-to-teacher ratio contain missing data that the null counts above do not fully capture, because the dataset also marks missing values with codes. As indicated in the online description of this dataset, -1 indicates that data is missing, -2 or N indicates that data is not applicable, and -9 indicates that data did not meet NCES data quality standards. Given this information, I would drop the 'AdultEd' and 'Grade13' columns, as this research focuses only on youth sports participation in traditional public schools. I would also drop the columns 'Phone', 'LEAname', 'LEAID', 'ST_LEAID', 'SurveyYear', 'StaffFTE', 'Member', and 'NCESID', as they are not necessary for analysis. Finally, I plan to remove the rows that have negative values in the lunch and student-to-teacher ratio columns.
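Because the codes -1, -2, and -9 are ordinary numbers, `isnull()` does not count them as missing. The following is a minimal sketch, on made-up rows, of how the coded values could be recoded as NaN so they show up in the missing-value counts. It covers only the numeric codes, not the text code N, and it is an alternative to the row filtering used in this notebook rather than what the notebook does.

```python
import numpy as np
import pandas as pd

# Toy rows; -1, -2 and -9 are the NCES codes described above.
toy = pd.DataFrame({
    "FreeLunch": [654, -1, 199, -9],
    "ReducedLunch": [43, -1, 0, -2],
    "StudentTeacherRatio": [19.78, 14.44, -2.0, 10.35],
})

SENTINELS = [-1, -2, -9]

# Before recoding: no missing values are reported.
print(toy.isnull().sum())

# Recode the sentinels as NaN so they are counted as missing.
recoded = toy.replace(SENTINELS, np.nan)
print(recoded.isnull().sum())
```

Recoding keeps every school in the frame, so the decision to drop rows can be made column by column later.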
dropCols = ['AdultEd','Phone','LEAname','LEAID','ST_LEAID','SurveyYear','StaffFTE','Member','NCESID','Grade13']
psChar_23 = psChar_23.drop(columns=dropCols)
psChar_23
psChar_23.isnull().sum()
X                           0
Y                           0
ObjectID                    0
StateABR                    0
SchoolName                  0
Street1                     1
Street2                100818
City                        0
State                       0
Zip                         0
Zip4                        0
Charter                     0
Virtual                     0
LowestGrade                 0
HighestGrade                0
SchoolLevel                 0
Status                      0
SchoolType                  0
Status_Text                 0
Locale                      0
County                      0
TotalFreeLunch              0
FreeLunch                   0
ReducedLunch                0
MealProgramCertified        0
PreK                    68998
Kindergarten            47329
Grade1                  46978
Grade2                  46921
Grade3                  46931
Grade4                  47132
Grade5                  48376
Grade6                  63367
Grade7                  68166
Grade8                  67898
Grade9                  73289
Grade10                 73501
Grade11                 73502
Grade12                 73574
Ungraded                93501
TotMaleEnrollment        2480
TotFemaleEnrollment      2480
TotalEnrollment          1671
StudentTeacherRatio      1814
AIANMale                 2581
AIANFem                  2579
AIANTotal                2533
AsianMale                2492
AsianFemale              2490
AsianTotal               2484
BlackMale                2494
BlackFemale              2497
BlackTotal               2487
HPIMale                  2608
HPIFemale                2607
HPITotal                 2561
HispanicMale             2481
HispanicFemale           2480
HispanicTotal            2480
TRMale                   2487
TRFemale                 2485
TRTotal                  2484
WhiteMale                2481
WhiteFemale              2481
WhiteTotal               2480
Latitude                    0
Longitude                   0
dtype: int64
psChar_23["Status_Text"].unique()  # check to see if the schools are operational
# drop schools that are not yet open or are temporarily closed
psChar_23 = psChar_23[~psChar_23["Status_Text"].str.contains(
    "School to be operational within two years|School temporarily closed", na=False)]
psChar_23["SchoolType"].unique() #check to see the types of schools listed in the dataset, only looking at traditional schools so we can cut the others out
psChar_23 = psChar_23[psChar_23["SchoolType"].str.contains(
"Regular School", na=False)]
# keep only rows with non-negative FRPL (free and reduced-price lunch) values and student-teacher ratios;
# rows where StudentTeacherRatio is NaN fail the comparison and are dropped as well
negativeCols = ['ReducedLunch', 'MealProgramCertified','FreeLunch','StudentTeacherRatio']
psChar_23 = psChar_23[(psChar_23[negativeCols] >= 0).all(axis=1)].copy()
psChar_23.shape
(37392, 67)
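The filters above reduce the frame from 101,390 to 37,392 rows, so it can be useful to see how many rows each condition removes. The sketch below uses made-up rows, and `apply_filter` is a hypothetical helper written for this illustration, not a pandas function.

```python
import pandas as pd

# Toy rows; the school types and values are illustrative.
toy = pd.DataFrame({
    "SchoolType": ["Regular School", "Regular School",
                   "Special Education School", "Regular School"],
    "FreeLunch": [654, -1, 120, 199],
    "StudentTeacherRatio": [19.78, 14.44, 8.0, 10.10],
})

def apply_filter(df, mask, label):
    """Keep rows where mask is True and report how many were removed."""
    kept = df[mask]
    print(f"{label}: removed {len(df) - len(kept)}, kept {len(kept)}")
    return kept

step1 = apply_filter(toy, toy["SchoolType"] == "Regular School",
                     "regular schools only")
non_negative = (step1[["FreeLunch", "StudentTeacherRatio"]] >= 0).all(axis=1)
step2 = apply_filter(step1, non_negative, "non-negative values")
```

Reporting the counts per step makes it easier to tell whether most of the loss comes from the school-type filter or from the coded missing values.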
psChar_23['Locale'].unique()
array(['32-Town: Distant', '42-Rural: Distant', '41-Rural: Fringe',
'13-City: Small', '21-Suburb: Large', '33-Town: Remote',
'31-Town: Fringe', '23-Suburb: Small', '12-City: Mid-size',
'43-Rural: Remote', '22-Suburb: Mid-size', '11-City: Large'],
dtype=object)
Locale = {'42-Rural: Distant':'Rural',
'41-Rural: Fringe':'Rural',
'43-Rural: Remote':'Rural',
'32-Town: Distant':'Town',
'33-Town: Remote':'Town',
'31-Town: Fringe':'Town',
'13-City: Small':'City',
'12-City: Mid-size':'City',
'11-City: Large':'City',
'21-Suburb: Large':'Suburb',
'23-Suburb: Small':'Suburb',
'22-Suburb: Mid-size':'Suburb'}
Locale
{'42-Rural: Distant': 'Rural',
'41-Rural: Fringe': 'Rural',
'43-Rural: Remote': 'Rural',
'32-Town: Distant': 'Town',
'33-Town: Remote': 'Town',
'31-Town: Fringe': 'Town',
'13-City: Small': 'City',
'12-City: Mid-size': 'City',
'11-City: Large': 'City',
'21-Suburb: Large': 'Suburb',
'23-Suburb: Small': 'Suburb',
'22-Suburb: Mid-size': 'Suburb'}
psChar_23['Locale'] = psChar_23['Locale'].map(Locale)
psChar_23['Locale'].unique()
array(['Town', 'Rural', 'City', 'Suburb'], dtype=object)
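As an alternative to listing all twelve codes by hand, the broad category can be pulled out of the locale string itself, since each value has the form `NN-Category: Detail`. This is a sketch of that approach using `Series.str.extract`, not the method used in this notebook; the sample values are taken from the `unique()` output above.

```python
import pandas as pd

codes = pd.Series(["32-Town: Distant", "11-City: Large",
                   "22-Suburb: Mid-size", "43-Rural: Remote"])

# Capture the word between the first hyphen and the colon,
# e.g. "Town" in "32-Town: Distant".
broad = codes.str.extract(r"-(\w+):", expand=False)
print(broad.tolist())  # ['Town', 'City', 'Suburb', 'Rural']
```

Any value that does not match the pattern becomes NaN, which is also how `map` treats keys missing from the dictionary.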
psChar_23.describe()
| | X | Y | ObjectID | Zip | Status | TotalFreeLunch | FreeLunch | ReducedLunch | MealProgramCertified | PreK | Kindergarten | Grade1 | Grade2 | Grade3 | Grade4 | Grade5 | Grade6 | Grade7 | Grade8 | Grade9 | Grade10 | Grade11 | Grade12 | Ungraded | TotMaleEnrollment | TotFemaleEnrollment | TotalEnrollment | StudentTeacherRatio | AIANMale | AIANFem | AIANTotal | AsianMale | AsianFemale | AsianTotal | BlackMale | BlackFemale | BlackTotal | HPIMale | HPIFemale | HPITotal | HispanicMale | HispanicFemale | HispanicTotal | TRMale | TRFemale | TRTotal | WhiteMale | WhiteFemale | WhiteTotal | Latitude | Longitude |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 37392.000000 | 37392.000000 | 37392.000000 | 37392.000000 | 37392.000000 | 37392.000000 | 37392.000000 | 37392.000000 | 37392.000000 | 11978.000000 | 22550.000000 | 22629.000000 | 22642.000000 | 22602.000000 | 22538.000000 | 22257.000000 | 14797.000000 | 11647.000000 | 11603.000000 | 8163.000000 | 8083.000000 | 8065.000000 | 8051.000000 | 1952.000000 | 36722.000000 | 36722.000000 | 37392.000000 | 37392.000000 | 36672.000000 | 36670.000000 | 36700.000000 | 36717.000000 | 36720.000000 | 36722.000000 | 36713.000000 | 36711.000000 | 36717.000000 | 36661.000000 | 36663.000000 | 36689.000000 | 36722.000000 | 36722.000000 | 36722.000000 | 36719.000000 | 36720.000000 | 36720.000000 | 36722.000000 | 36722.000000 | 36722.000000 | 37392.000000 | 37392.000000 |
| mean | -100.251468 | 37.290953 | 36287.616683 | 63446.899497 | 1.014549 | 329.526610 | 294.008478 | 35.518132 | 211.785756 | 32.782017 | 72.719335 | 71.254938 | 69.543636 | 71.746350 | 71.135815 | 72.845082 | 110.678448 | 141.237400 | 144.294062 | 223.755115 | 217.416924 | 200.690763 | 191.246429 | 5.592725 | 298.766843 | 283.438457 | 582.363982 | 17.143016 | 3.452471 | 3.328279 | 6.775395 | 18.924422 | 17.763154 | 36.684031 | 45.420178 | 43.891340 | 89.299398 | 1.804724 | 1.707362 | 3.509499 | 90.627880 | 86.666930 | 177.294810 | 16.256734 | 15.626416 | 31.882707 | 122.303170 | 114.477398 | 236.780568 | 37.290953 | -100.251468 |
| std | 19.640040 | 6.016159 | 29092.351157 | 28541.959418 | 0.193091 | 302.519034 | 276.192414 | 57.319530 | 205.370496 | 38.554464 | 43.464381 | 41.309698 | 40.438018 | 41.754483 | 42.033706 | 46.943546 | 107.750815 | 131.743413 | 135.120098 | 228.483527 | 214.329761 | 199.781788 | 191.818897 | 9.279454 | 244.643853 | 236.573281 | 478.938331 | 13.327592 | 16.876568 | 16.257685 | 33.002197 | 51.673689 | 48.952495 | 100.345711 | 81.985733 | 80.888984 | 162.123805 | 10.492019 | 9.835723 | 20.222761 | 136.744539 | 131.278507 | 267.380097 | 19.221038 | 18.730197 | 37.461901 | 131.928343 | 126.522191 | 257.671934 | 6.016159 | 19.640040 |
| min | -171.715402 | 14.140873 | 1.000000 | 3901.000000 | 1.000000 | 3.000000 | 0.000000 | 0.000000 | 3.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 9.000000 | 0.610000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 14.140873 | -171.715402 |
| 25% | -118.201758 | 33.753817 | 11736.500000 | 34249.000000 | 1.000000 | 136.000000 | 115.000000 | 5.000000 | 73.000000 | 10.000000 | 45.000000 | 45.000000 | 44.000000 | 45.000000 | 44.000000 | 44.000000 | 36.000000 | 34.000000 | 35.000000 | 44.000000 | 44.000000 | 42.000000 | 40.000000 | 0.000000 | 157.000000 | 148.000000 | 308.000000 | 13.490000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 2.000000 | 1.000000 | 3.000000 | 0.000000 | 0.000000 | 0.000000 | 11.000000 | 11.000000 | 22.000000 | 4.000000 | 3.000000 | 7.000000 | 26.000000 | 24.000000 | 50.000000 | 33.753816 | -118.201758 |
| 50% | -93.889011 | 36.964123 | 26215.500000 | 64014.000000 | 1.000000 | 261.000000 | 230.000000 | 20.000000 | 158.000000 | 24.000000 | 69.000000 | 68.000000 | 66.000000 | 68.000000 | 67.000000 | 68.000000 | 73.000000 | 95.000000 | 97.000000 | 135.000000 | 132.000000 | 120.000000 | 113.000000 | 2.000000 | 244.000000 | 230.000000 | 474.000000 | 16.140000 | 1.000000 | 0.000000 | 1.000000 | 3.000000 | 3.000000 | 6.000000 | 11.000000 | 10.000000 | 21.000000 | 0.000000 | 0.000000 | 0.000000 | 40.000000 | 38.000000 | 78.000000 | 11.000000 | 10.000000 | 22.000000 | 89.000000 | 82.000000 | 171.000000 | 36.964123 | -93.889011 |
| 75% | -84.473956 | 39.752610 | 51456.250000 | 92405.000000 | 1.000000 | 427.000000 | 387.000000 | 45.000000 | 285.000000 | 43.000000 | 95.000000 | 92.000000 | 90.000000 | 93.000000 | 92.000000 | 94.000000 | 150.000000 | 228.000000 | 232.000000 | 373.000000 | 361.000000 | 330.000000 | 310.000000 | 8.000000 | 358.000000 | 338.000000 | 694.000000 | 20.000000 | 2.000000 | 2.000000 | 3.000000 | 14.000000 | 13.000000 | 27.000000 | 54.000000 | 52.000000 | 106.000000 | 1.000000 | 1.000000 | 2.000000 | 118.000000 | 114.000000 | 233.000000 | 22.000000 | 21.000000 | 44.000000 | 172.000000 | 160.000000 | 332.000000 | 39.752610 | -84.473956 |
| max | 145.784430 | 71.298478 | 100508.000000 | 99950.000000 | 8.000000 | 5770.000000 | 5563.000000 | 1400.000000 | 2921.000000 | 903.000000 | 873.000000 | 646.000000 | 665.000000 | 691.000000 | 669.000000 | 727.000000 | 923.000000 | 844.000000 | 930.000000 | 6251.000000 | 2855.000000 | 1293.000000 | 1339.000000 | 223.000000 | 4352.000000 | 4524.000000 | 8876.000000 | 1860.000000 | 585.000000 | 513.000000 | 1098.000000 | 1335.000000 | 1224.000000 | 2559.000000 | 2195.000000 | 2207.000000 | 4402.000000 | 556.000000 | 440.000000 | 996.000000 | 1947.000000 | 2118.000000 | 4065.000000 | 436.000000 | 422.000000 | 828.000000 | 1989.000000 | 2305.000000 | 4294.000000 | 71.298478 | 145.784430 |
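# Keep an independent snapshot; plain assignment would only create a second name for the same DataFrame
psChar_23og = psChar_23.copy()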
Observations of Descriptive Statistics¶
(Min, Max):
- TotalFreeLunch (3, 5770); FreeLunch (0, 5563); ReducedLunch (0, 1400); MealProgramCertified (3, 2921)
- PreK (0, 903)
- Kindergarten (0, 873)
- Grade1 (0, 646)
- Grade2 (0, 665)
- Grade3 (0, 691)
- Grade4 (0, 669)
- Grade5 (0, 727)
- Grade6 (0, 923)
- Grade7 (0, 844)
- Grade8 (0, 930)
- Grade9 (0, 6251)
- Grade10 (0, 2855)
- Grade11 (0, 1293)
- Grade12 (0, 1339)
- Ungraded (0, 223)
- Total Male Enrollment (0, 4352); Total Female Enrollment (0, 4524); Total Enrollment (9, 8876)
- Student Teacher Ratio (0.61, 1860)
- American Indian/Alaskan Native Male (0, 585); American Indian/Alaskan Native Female (0, 513); American Indian/Alaskan Native Total (0, 1098)
- Asian Male (0, 1335); Asian Female (0, 1224); Asian Total (0, 2559)
- Black Male (0, 2195); Black Female (0, 2207); Black Total (0, 4402)
- Native Hawaiian/Pacific Islander(HPI) Male (0, 556); Native Hawaiian/Pacific Islander(HPI) Female (0, 440); Native Hawaiian/Pacific Islander(HPI) Total (0, 996)
- Hispanic Male (0, 1947); Hispanic Female (0, 2118); Hispanic Total (0, 4065)
- Two or More Races Male (0, 436); Two or More Races Female (0, 422); Two or More Races Total (0, 828)
- White Male (0, 1989); White Female (0, 2305); White Total (0, 4294)
Mean:
- TotalFreeLunch: 329.53; FreeLunch: 294.01; ReducedLunch: 35.52; MealProgramCertified: 211.79
- PreK: 32.78 students
- Kindergarten: 72.72 students
- Grade1: 71.25 students
- Grade2: 69.54 students
- Grade3: 71.75 students
- Grade4: 71.14 students
- Grade5: 72.85 students
- Grade6: 110.68 students
- Grade7: 141.24 students
- Grade8: 144.29 students
- Grade9: 223.76 students
- Grade10: 217.42 students
- Grade11: 200.69 students
- Grade12: 191.25 students
- Ungraded: 5.59 students
- Total Male Enrollment: 298.77 students; Total Female Enrollment: 283.44 students; Total Enrollment: 582.36 students
- Student Teacher Ratio: 17.14 students/teacher
- American Indian/Alaskan Native Male: 3.45 students; American Indian/Alaskan Native Female: 3.33 students; American Indian/Alaskan Native Total: 6.78 students
- Asian Male: 18.92 students; Asian Female: 17.76 students; Asian Total: 36.68 students
- Black Male: 45.42 students; Black Female: 43.89 students; Black Total: 89.30 students
- Native Hawaiian/Pacific Islander(HPI) Male: 1.80 students; Native Hawaiian/Pacific Islander(HPI) Female: 1.71 students; Native Hawaiian/Pacific Islander(HPI) Total: 3.51 students
- Hispanic Male: 90.63 students; Hispanic Female: 86.67 students; Hispanic Total: 177.29 students
- Two or More Races Male: 16.26 students; Two or More Races Female: 15.63 students; Two or More Races Total: 31.88 students
- White Male: 122.30 students; White Female: 114.48 students; White Total: 236.78 students
Quartile Ranges (25%, 75%):
- TotalFreeLunch: (136, 427); FreeLunch: (115, 387); ReducedLunch: (5, 45); MealProgramCertified: (73, 285)
- PreK: (10, 43)
- Kindergarten: (45, 95)
- Grade1: (45, 92)
- Grade2: (44, 90)
- Grade3: (45, 93)
- Grade4: (44, 92)
- Grade5: (44, 94)
- Grade6: (36, 150)
- Grade7: (34, 228)
- Grade8: (35, 232)
- Grade9: (44, 373)
- Grade10: (44, 361)
- Grade11: (42, 330)
- Grade12: (40, 310)
- Ungraded: (0, 8)
- Total Male Enrollment: (157, 358); Total Female Enrollment: (148, 338); Total Enrollment: (308, 694)
- Student Teacher Ratio: (13.49, 20)
- American Indian/Alaskan Native Male: (0, 2); American Indian/Alaskan Native Female: (0, 2); American Indian/Alaskan Native Total: (0, 3)
- Asian Male: (0, 14); Asian Female: (0, 13); Asian Total: (1, 27)
- Black Male: (1, 54); Black Female: (1, 52); Black Total: (3, 106)
- Native Hawaiian/Pacific Islander(HPI) Male: (0, 1); Native Hawaiian/Pacific Islander(HPI) Female: (0, 1); Native Hawaiian/Pacific Islander(HPI) Total: (0, 2)
- Hispanic Male: (11, 118); Hispanic Female: (11, 114); Hispanic Total: (22, 233)
- Two or More Races Male: (4, 22); Two or More Races Female: (3, 21); Two or More Races Total: (7, 44)
- White Male: (26, 172); White Female: (24, 160); White Total: (50, 332)
Standard Deviation:
Higher than mean: Reduced Lunch, PreK, Grade9, Grade12, Ungraded, and all student race/ethnicity columns.
Lower than mean: Total Free Lunch, Free Lunch, Meal Program Certified, all grades (except 9 and 12), Total Male Enrollment, Total Female Enrollment, Total Enrollment, Student to Teacher Ratio.
FRPL rates: the standard deviations are moderately lower than the means, except for ReducedLunch, where the standard deviation is higher than the mean.
The standard deviations for PreK, Grade 9, and Grade 12 are higher than the means, while those for all other grades are lower.
The standard deviations for enrollment rates are lower than the means.
The standard deviation for the student to teacher ratio is lower than the mean.
The standard deviations for all student demographics are higher than the means, though the disparity found in White student demographics is much less significant compared to other races/ethnicities.
Mean/Median Closeness:
The medians for the free/reduced lunch status of the schools are lower than the means.
For the columns covering the elementary school grades, the medians are close but lower than the mean values. For the other grades, the medians are not as close, but are still lower than the means.
The median for the student-teacher ratio is close to the mean.
The median values for the Black and Hispanic student demographics are significantly lower than the mean values.
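The comparisons in this section can also be computed rather than read off the `describe()` table. The following is a minimal sketch, assuming a small made-up DataFrame in place of `psChar_23`; the `toy` data and the `summarize_spread` helper are hypothetical names used only for illustration. It reports, for each column, whether the standard deviation exceeds the mean and how far the mean sits above the median.

```python
import pandas as pd

def summarize_spread(df: pd.DataFrame) -> pd.DataFrame:
    """Compare std to mean and median to mean for each numeric column."""
    # Transpose so each row is a column of df and each column is a statistic
    stats = df.describe().T
    out = pd.DataFrame({
        "mean": stats["mean"],
        "median": stats["50%"],
        "std": stats["std"],
    })
    out["std_over_mean"] = out["std"] / out["mean"]
    out["std_exceeds_mean"] = out["std"] > out["mean"]
    # A large positive gap suggests a right-skewed column (a few very large values)
    out["mean_minus_median"] = out["mean"] - out["median"]
    return out

# Hypothetical toy data standing in for psChar_23 (values are illustrative only)
toy = pd.DataFrame({
    "ReducedLunch": [0, 5, 20, 45, 400],
    "StudentTeacherRatio": [13.0, 15.0, 16.0, 18.0, 20.0],
})
summary = summarize_spread(toy)
print(summary)
```

Applied to the real data, the same helper could be called on the numeric columns of `psChar_23` to check the lists above in one step.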
print(psChar_23["StudentTeacherRatio"].describe())
count    37392.000000
mean        17.143016
std         13.327592
min          0.610000
25%         13.490000
50%         16.140000
75%         20.000000
max       1860.000000
Name: StudentTeacherRatio, dtype: float64
psChar_23.columns
Index(['X', 'Y', 'ObjectID', 'StateABR', 'SchoolName', 'Street1', 'Street2',
'City', 'State', 'Zip', 'Zip4', 'Charter', 'Virtual', 'LowestGrade',
'HighestGrade', 'SchoolLevel', 'Status', 'SchoolType', 'Status_Text',
'Locale', 'County', 'TotalFreeLunch', 'FreeLunch', 'ReducedLunch',
'MealProgramCertified', 'PreK', 'Kindergarten', 'Grade1', 'Grade2',
'Grade3', 'Grade4', 'Grade5', 'Grade6', 'Grade7', 'Grade8', 'Grade9',
'Grade10', 'Grade11', 'Grade12', 'Ungraded', 'TotMaleEnrollment',
'TotFemaleEnrollment', 'TotalEnrollment', 'StudentTeacherRatio',
'AIANMale', 'AIANFem', 'AIANTotal', 'AsianMale', 'AsianFemale',
'AsianTotal', 'BlackMale', 'BlackFemale', 'BlackTotal', 'HPIMale',
'HPIFemale', 'HPITotal', 'HispanicMale', 'HispanicFemale',
'HispanicTotal', 'TRMale', 'TRFemale', 'TRTotal', 'WhiteMale',
'WhiteFemale', 'WhiteTotal', 'Latitude', 'Longitude'],
dtype='object')
print(type(psChar_23))
<class 'pandas.core.frame.DataFrame'>
Scatter Plots¶
import matplotlib.pyplot as plt
import numpy as np
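# Drop rows where the FRPL count exceeds enrollment; .copy() avoids a SettingWithCopyWarning on the assignment below
psChar_23 = psChar_23[psChar_23["TotalFreeLunch"] <= psChar_23["TotalEnrollment"]].copy()
psChar_23["LunchRate"] = (psChar_23["TotalFreeLunch"] / psChar_23["TotalEnrollment"]) * 100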
race_colors = {"BlackTotal": "tab:blue", "HispanicTotal": "tab:orange", "WhiteTotal": "tab:green"}
size_values = {"BlackTotal": 45, "HispanicTotal": 45, "WhiteTotal": 45}
psChar_23["PredominantRace"] = psChar_23[["BlackTotal", "HispanicTotal", "WhiteTotal"]].idxmax(axis=1)
fig, ax = plt.subplots()
for race, color in race_colors.items():
    # Plot one series per predominant race so each group gets its own color and legend entry
    subset = psChar_23[psChar_23["PredominantRace"] == race]
    x = subset["LunchRate"]
    y = subset["StudentTeacherRatio"]
    ax.scatter(x, y, c=color, s=size_values[race], label=race, alpha=0.3, edgecolors='none')
ax.legend(('Predominately Black School', 'Predominately Latino/Hispanic School', 'Predominately White School'), loc='upper right', shadow=True)
ax.grid(True)
ax.set_xlabel("% of Students w/ FRPL Eligibility")
ax.set_ylabel("Students per Teacher")
ax.set_title("FRPL Eligibility & Student-Teacher Ratio (by Race)")
ax.set_xlim(0, 100)
ax.set_ylim(0, 50)
ax.set_ymargin(0.1)
ax.set_xmargin(0.1)
plt.show()
Hispanic/Latino Students¶
race_colors = {"BlackTotal": "tab:blue", "HispanicTotal": "tab:orange", "WhiteTotal": "tab:green"}
size_values = {"BlackTotal": 0, "HispanicTotal": 45, "WhiteTotal": 0}
psChar_23["PredominantRace"] = psChar_23[["BlackTotal", "HispanicTotal", "WhiteTotal"]].idxmax(axis=1)
fig, ax = plt.subplots()
for race, color in race_colors.items():
    # A marker size of 0 hides the other groups, so only predominately Hispanic schools are drawn
    subset = psChar_23[psChar_23["PredominantRace"] == race]
    x = subset["LunchRate"]
    y = subset["StudentTeacherRatio"]
    ax.scatter(x, y, c=color, s=size_values[race], label=race, alpha=0.3, edgecolors='none')
ax.grid(True)
ax.set_xlabel("% of Students w/ FRPL Eligibility")
ax.set_ylabel("Students per Teacher")
ax.set_title("FRPL Eligibility & Student-Teacher Ratio (Hispanic Students)")
ax.set_xlim(0, 100)
ax.set_ylim(0, 50)
ax.set_ymargin(0.1)
ax.set_xmargin(0.1)
plt.show()
Black Students¶
race_colors = {"BlackTotal": "tab:blue", "HispanicTotal": "tab:orange", "WhiteTotal": "tab:green"}
size_values = {"BlackTotal": 45, "HispanicTotal": 0, "WhiteTotal": 0}
psChar_23["PredominantRace"] = psChar_23[["BlackTotal", "HispanicTotal", "WhiteTotal"]].idxmax(axis=1)
fig, ax = plt.subplots()
for race, color in race_colors.items():
    # A marker size of 0 hides the other groups, so only predominately Black schools are drawn
    subset = psChar_23[psChar_23["PredominantRace"] == race]
    x = subset["LunchRate"]
    y = subset["StudentTeacherRatio"]
    ax.scatter(x, y, c=color, s=size_values[race], label=race, alpha=0.3, edgecolors='none')
ax.grid(True)
ax.set_xlabel("% of Students w/ FRPL Eligibility")
ax.set_ylabel("Students per Teacher")
ax.set_title("FRPL Eligibility & Student-Teacher Ratio (Black Students)")
ax.set_xlim(0, 100)
ax.set_ylim(0, 50)
ax.set_ymargin(0.1)
ax.set_xmargin(0.1)
plt.show()
White Students¶
race_colors = {"BlackTotal": "tab:blue", "HispanicTotal": "tab:orange", "WhiteTotal": "tab:green"}
size_values = {"BlackTotal": 0, "HispanicTotal": 0, "WhiteTotal": 45}
psChar_23["PredominantRace"] = psChar_23[["BlackTotal", "HispanicTotal", "WhiteTotal"]].idxmax(axis=1)
fig, ax = plt.subplots()
for race, color in race_colors.items():
    # A marker size of 0 hides the other groups, so only predominately White schools are drawn
    subset = psChar_23[psChar_23["PredominantRace"] == race]
    x = subset["LunchRate"]
    y = subset["StudentTeacherRatio"]
    ax.scatter(x, y, c=color, s=size_values[race], label=race, alpha=0.3, edgecolors='none')
ax.grid(True)
ax.set_xlabel("% of Students w/ FRPL Eligibility")
ax.set_ylabel("Students per Teacher")
ax.set_title("FRPL Eligibility & Student-Teacher Ratio (White Students)")
ax.set_xlim(0, 100)
ax.set_ylim(0, 50)
ax.set_ymargin(0.1)
ax.set_xmargin(0.1)
plt.show()
Bubble Plot¶
import plotly.graph_objects as go
import plotly.express as px
import pandas as pd
import math
sample_size = 1000
sample = psChar_23.sample(n=sample_size, random_state=1)
hover_text = []
bubble_size = []
for index, row in sample.iterrows():
    # Build the hover label for this school
    hover_text.append(('School: {SchoolName}<br>'+
                       'Lunch Rate: {LunchRate:.2f}<br>'+
                       'Students per Teacher: {StudentTeacherRatio}<br>'+
                       'Total Enrollment: {TotalEnrollment}<br>').format(SchoolName=row['SchoolName'],
                                                                        LunchRate=row['LunchRate'],
                                                                        StudentTeacherRatio=row['StudentTeacherRatio'],
                                                                        TotalEnrollment=row['TotalEnrollment']))
    # Bubble size grows with the square root of enrollment
    bubble_size.append(math.sqrt(row['TotalEnrollment']))
sample['text'] = hover_text
sample['size'] = bubble_size
sizeref = 2.*max(sample['size'])/(25**2)
race_categories = ['BlackTotal', 'HispanicTotal', 'WhiteTotal']
race_data = {race: sample[sample["PredominantRace"] == race] for race in race_categories}
fig = go.Figure()
for race, subset in race_data.items():
    # One trace per predominant race
    fig.add_trace(go.Scatter(
        x=subset["LunchRate"],
        y=subset["StudentTeacherRatio"],
        name=race,
        text=subset["text"],
        marker_size=subset['size'],
    ))
fig.update_traces(mode='markers', marker=dict(sizemode='area',
                                              sizeref=sizeref, line_width=2))
fig.update_layout(
    title="FRPL Eligibility & Student-Teacher Ratio",
    xaxis=dict(title="% of Students w/ FRPL Eligibility", gridcolor='white', gridwidth=2),
    yaxis=dict(title="Students per Teacher", gridcolor='white', gridwidth=2, range=[0, 50],
               dtick=20),
    paper_bgcolor='rgb(243, 243, 243)',
    plot_bgcolor='rgb(243, 243, 243)',
)
fig.show()
Executive Summary¶
Introduction¶
Each year, the Aspen Institute releases the State of Play report, an analysis of sports participation data on youth in the United States. Key findings in the most recent report, released in 2024, suggest a connection between income and sports participation.
According to the National Survey of Children's Health (NSCH), Vermont, Iowa, North Dakota, Wyoming, Maine, South Dakota, and New Hampshire reported the highest percentages of youth sports participation (over 63%). In contrast, New Mexico, Nevada, Mississippi, and Louisiana reported below-average youth sports participation (under 54%). Excluding Nevada, these states are among the poorest in the country and have larger minority populations, while many of the states with the highest participation rates have small minority populations.
For instance, data from the Sports & Fitness Industry Association (SFIA) and its Sports Marketing Surveys (SMS) show that the sports participation rate among Black youth aged 6-17 declined from 45% to 35% over a ten-year period (2013-2023).
Sports participation rates in general have been steadily returning to pre-COVID levels; however, post-COVID trends in youth sports participation differ across demographics. For instance, youth aged 6-12 from lower-income families (under $25,000) were the only income group whose sports participation rate declined from 2022 to 2023, while the rate for every other income bracket increased.
Purpose¶
Although more youth are returning to sports post-COVID, opportunities for sports participation seem to depend heavily on factors such as socioeconomic status, accessibility, and resource allocation. In public schools, disparities in funding can limit access to appropriate facilities, personnel, or physical education, which could hinder opportunities to participate in sports or physical activity. Outside of school, the high costs of specialized or club teams, or a lack of access to recreational facilities, can also be barriers, especially for lower-income families.
Facility Quality¶
In a 2021 report by the 21st Century School Fund, a non-profit dedicated to improving public school facilities, numerous inequities were found in public school funding along racial, socioeconomic, and geographic lines. For example, rural school districts with lower-income public schools received about 2.3 million dollars per school for capital improvements to facilities and buildings, while the national average was 4.3 million dollars per school, meaning that those rural low-income schools received only about half (roughly 53%) of the national average.
Physical Education¶
A study by researchers for the Bridging the Gap program found that only 43% of students at predominately Black public elementary schools, and 55% of students at predominately Hispanic/Latino schools, received the recommended 20+ minutes of recess time. These percentages, when compared to the 77% of students at predominately White schools, suggest an association between a school's racial composition and the physical activity offered to students through recess time. In another study of physical activity among middle school students, researchers found a significant relationship between schools having a higher number of students in the free/reduced lunch program and having less environmental access for physical activity.
Specialized Sports¶
Many parents sign their children up for specialized sports -- year-round training and competition in the form of AAU or club teams -- in order to increase scouting opportunities and assist with skill/performance development. Sports specialization typically requires costly investments towards participation, travel, and equipment fees, which can create financial barriers, especially in high-cost sports such as tennis, gymnastics, and ice hockey.
Methodology¶
Public School Characteristics 2022-23
Last Updated: October 21, 2024
https://catalog.data.gov/dataset/public-school-characteristics-2022-23-451db
The National Center for Education Statistics (NCES) gathers demographic and geographic data about U.S. public schools, along with factors such as enrollment and Title I status. The variables analyzed here are free/reduced-price lunch (FRPL) rates, class size, and student demographics.
The uncleaned dataset had 101,390 rows and 77 columns. The first step was to address missing values, specifically in the FRPL columns. As indicated in the dataset's online description, missing values are represented by several indicators: -1 indicates that data is missing, -2 or N indicates that data is not applicable, and -9 indicates that data did not meet NCES data quality standards. Because a valid count cannot be negative, rows containing these negative values were dropped. The next step was to make sure all of the schools being analyzed were operational, so schools reported as "School temporarily closed" or "School to be operational within two years" in the 'Status' column were removed. Similarly, because this analysis focuses on traditional schools, schools reported in the "SchoolType" column as alternative, special education, or career and technical schools were also removed. Additionally, columns unnecessary to the analysis, such as administrative information (StaffFTE, Phone, LEA Name, LEA ID, etc.), were removed. Lastly, since this analysis focuses on K-12 students, the columns 'Adult Ed' and 'Grade 13' were removed.
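The row-filtering steps described above can be sketched as follows. This is an illustrative sketch and not the notebook's actual cleaning code: the toy rows are made up, the FRPL columns are assumed to be numeric already, the status strings are assumed to live in `Status_Text`, and the label `"Regular school"` for traditional schools is an assumption about how `SchoolType` is coded.

```python
import pandas as pd

FRPL_COLS = ["TotalFreeLunch", "FreeLunch", "ReducedLunch"]
EXCLUDED_STATUS = [
    "School temporarily closed",
    "School to be operational within two years",
]
# Assumption: traditional schools are labeled "Regular school" in SchoolType
TRADITIONAL_TYPE = "Regular school"

def clean_schools(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the filtering steps described above and return a new DataFrame."""
    # The sentinel codes (-1, -2, -9) are all negative, so keep rows where every FRPL value is >= 0
    valid_frpl = (df[FRPL_COLS] >= 0).all(axis=1)
    operational = ~df["Status_Text"].isin(EXCLUDED_STATUS)
    traditional = df["SchoolType"] == TRADITIONAL_TYPE
    return df[valid_frpl & operational & traditional].copy()

# Hypothetical rows for illustration only
raw = pd.DataFrame({
    "SchoolName": ["A", "B", "C", "D"],
    "TotalFreeLunch": [120, -1, 300, 80],
    "FreeLunch": [100, -1, 250, 70],
    "ReducedLunch": [20, -1, 50, 10],
    "Status_Text": ["Currently operational", "Currently operational",
                    "School temporarily closed", "Currently operational"],
    "SchoolType": ["Regular school", "Regular school",
                   "Regular school", "Alternative school"],
})
cleaned = clean_schools(raw)
print(cleaned["SchoolName"].tolist())
```

Returning a `.copy()` keeps later column assignments on the cleaned frame from touching the raw data.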
Descriptive statistics were then used to find key information about the variables being analyzed such as minimum and maximum values, mean and median, quartile ranges, and standard deviation.
For visualizations, scatter plots and a bubble plot were chosen because the two main variables for analysis are both numerical, and these plot types can reveal any clear correlation between them. The visualizations were also color-coded by the predominant race (Black, Hispanic/Latino, or White) of each school. Four scatter plots were generated: one with all of the races, and one for each race being analyzed. One bubble plot was generated from a sample of 1000 schools, with all analyzed races shown in one plot.
Analysis¶
According to the National Center for Education Statistics (NCES), the following percentages are used to determine low-poverty and high-poverty schools:
- Low-poverty: 25% or less of students are FRPL eligible
- Mid-low poverty: 25.1% to 50% of students are FRPL eligible
- Mid-high poverty: 50.1% to 75% of students are FRPL eligible
- High-poverty: more than 75% of students are FRPL eligible
For the analysis, FRPL values were used as an indicator of income, while student-to-teacher ratio was used as an indicator of resource allocation. Further research might find that school locale (urban, suburban, town, rural) is also a factor that correlates with these variables, but it was not considered for this analysis.
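The four bands above can be assigned to each school with `pandas.cut`. The sketch below is illustrative only: the `poverty_category` helper and the sample rates are hypothetical, and it assumes that a school at exactly 75% falls in the mid-high band, with the high-poverty band starting just above 75%.

```python
import pandas as pd

def poverty_category(lunch_rate: pd.Series) -> pd.Series:
    """Bin FRPL eligibility percentages (0-100) into the four poverty bands."""
    bins = [0, 25, 50, 75, 100]
    labels = ["Low-poverty", "Mid-low poverty", "Mid-high poverty", "High-poverty"]
    # Intervals are right-closed: [0, 25], (25, 50], (50, 75], (75, 100]
    return pd.cut(lunch_rate, bins=bins, labels=labels, include_lowest=True)

# Hypothetical FRPL percentages chosen to sit on and around the band edges
rates = pd.Series([10.0, 25.0, 25.1, 50.0, 62.5, 75.0, 75.1, 100.0])
categories = poverty_category(rates)
print(categories.tolist())
```

With the notebook's data, the same function could be applied to the `LunchRate` column to count schools per band.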
FRPL¶
The first observations are the mean numbers of students receiving free lunch and reduced-price lunch. The mean is around 294.01 students per school for free lunch and around 35.52 students for reduced-price lunch. Because, on average, far more FRPL-eligible students qualify for fully free lunch than for reduced-price lunch, this could be an indicator of lower-income schools. The standard deviation (std) for the free lunch column is lower than the mean, while the std for the reduced-price lunch column is higher, suggesting more relative variability in the number of students receiving reduced-price lunch. The quartiles support this: the middle 50% of schools have between 5 and 45 students receiving reduced-price lunch, yet some schools have as many as 1400, far above the mean. From these statistics, it can be inferred that a small number of schools have a disproportionate number of students receiving FRPL, and they might be overlooked if only the mean is considered.
Enrollment Trends¶
The mean values for enrollment show a clear drop from grade 9 to grade 12, from a mean of around 223.76 students per school (grade 9) to a mean of around 191.25 students per school (grade 12). Grade 9 has a std larger than its mean, with a max value of 6251 students. Grade 12 also has a std larger than its mean, but the difference is only about 0.6.
Class Size¶
In the descriptive statistics for the student-to-teacher ratio, the values seem consistent, as the std of around 13.33 students/teacher is lower than the mean of around 17.14 students/teacher. The interquartile range is also narrow (13.49-20.00); however, the max value is 1860 students/teacher, indicating either a data entry error or an atypical class size.
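One common way to flag values like the 1860 maximum is the interquartile-range (Tukey) rule, which was not part of the original analysis and is shown here only as a possible check. With the reported quartiles (13.49 and 20.00), the upper fence would be 20.00 + 1.5 × 6.51 ≈ 29.8 students per teacher. The `iqr_outliers` helper and the sample ratios below are hypothetical.

```python
import pandas as pd

def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return a boolean mask marking values outside the Tukey fences."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)

# Hypothetical ratios; only the last value mirrors the reported maximum of 1860
ratios = pd.Series([12.5, 13.5, 15.0, 16.1, 17.0, 20.0, 22.0, 1860.0])
mask = iqr_outliers(ratios)
print(ratios[mask].tolist())
```

Flagged rows would still need manual review, since a very high ratio could reflect a reporting quirk and not a true class size.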
Race/Ethnicity of Students¶
The mean for total Black students is around 89.30 students per school, and the interquartile range runs from 3 to 106 students, yet the max value reaches 4402. For Hispanic/Latino students, the mean is around 177.29 students, with an interquartile range of 22 to 233 and a max value of 4065. The mean for white students is around 236.78 students, with an interquartile range of 50 to 332 students and a max value of 4294. These statistics suggest that white students are more evenly distributed across schools, while Black and Hispanic enrollment varies more widely -- most schools have few or no Black or Hispanic students, while a small number have high concentrations of them.
Visualization Findings¶
Scatter Plots¶
In the scatter plot displaying all races in this analysis (Black, Hispanic, white), the first observation is that the number of points representing predominately white schools is significantly larger than the number representing predominately Black or Hispanic schools.
For predominately white schools, there is a larger concentration of points to the left (lower FRPL percentage). As the FRPL percentage increases, the number of students per teacher slightly decreases, but the plot itself does not show any extreme trends.
For predominately Black schools, there are significantly fewer points than for predominately Hispanic or white schools. On this plot, there is a much larger concentration of schools with a higher FRPL percentage, with most falling into the 60-100% range, and only a very small number of schools with 0-40% of students eligible for FRPL.
For predominately Hispanic schools, there is also a larger concentration of schools with a higher FRPL percentage, though the points are more dispersed across the plot than those in the predominately Black schools plot.
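The visual impressions above (how many points each group has, and where they concentrate on the FRPL axis) can be backed with a simple grouped summary. The sketch below is an illustration on made-up data: the `summarize_by_group` helper and the toy rows are hypothetical, and the 60% cutoff is chosen only to mirror the 60-100% range mentioned above.

```python
import pandas as pd

def summarize_by_group(df: pd.DataFrame) -> pd.DataFrame:
    """Count schools and summarize LunchRate within each predominant-race group."""
    grouped = df.groupby("PredominantRace")["LunchRate"]
    summary = grouped.agg(n_schools="count", median_lunch_rate="median")
    # Share of schools in each group with more than 60% of students FRPL eligible
    summary["share_above_60"] = grouped.apply(lambda s: (s > 60).mean())
    return summary

# Hypothetical rows for illustration only
toy = pd.DataFrame({
    "PredominantRace": ["WhiteTotal"] * 4 + ["BlackTotal"] * 2 + ["HispanicTotal"] * 2,
    "LunchRate": [10.0, 20.0, 30.0, 70.0, 65.0, 90.0, 40.0, 80.0],
})
summary = summarize_by_group(toy)
print(summary)
```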
Bubble Plot¶
The bubble plot uses a sample of 1000 schools and displays color-coded points to show whether a school is predominately white, Black, or Hispanic. This plot shows findings similar to the scatter plots: a higher concentration toward lower lunch rate percentages for predominately white schools; points dispersed across the plot for predominately Hispanic schools, though more concentrated as the lunch rate increases; and relatively few predominately Black schools, whose points trend toward higher lunch rate percentages.
Recommendations¶
Based on this analysis, key findings indicate inequities within the U.S. public school system that could be addressed through policy and intervention. The first recommendation is to intentionally identify schools with a disproportionate number of students receiving FRPL, rather than allowing them to be overlooked by considering only mean values, especially when the data includes significant outliers. The second concerns the uneven distribution of Black and Hispanic/Latino students across schools, which highlights racial and socioeconomic issues. This could be addressed by identifying any barriers predominately Black or Hispanic schools might face, and by prioritizing resource allocation to schools with a large concentration of Black and Hispanic students in high-poverty areas.
Conclusion¶
Though this research sought to explore how school funding and resource allocation could affect youth sports participation, this analysis also highlighted broad racial and socioeconomic disparities in the U.S. public school system. By using children's access to and opportunity for sports participation as an example, the findings from this analysis point to a broader issue -- systemic inequities that can affect students' experiences and opportunities. To address these inequities, policymakers should be intentional and take notice of marginalized and under-resourced communities to ensure all students and school districts receive equitable opportunity and treatment.
References¶
Aspen Institute. (2023). State of Play 2023. Project Play. https://projectplay.org/state-of-play-2023/participation
Carlson, J. A., Mignano, A. M., Norman, G. J., McKenzie, T. L., Kerr, J., Arredondo, E. M., Madanat, H., Cain, K. L., Elder, J. P., Saelens, B. E., & Sallis, J. F. (2014). Socioeconomic disparities in elementary school practices and children's physical activity during school. American Journal of Health Promotion, 28(3 Suppl), S47–S53. https://doi.org/10.4278/ajhp.130430-QUAN-206
Young, D. R., Felton, G. M., Grieser, M., Elder, J. P., Johnson, C., Lee, J. S., & Kubik, M. Y. (2007). Policies and opportunities for physical activity in middle school environments. Journal of School Health, 77(1), 41–47. https://doi.org/10.1111/j.1746-1561.2007.00161.x